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ABSTRACT 

Stellar radial velocity (RV) measurements have proven to be a very successful method for 
detecting extrasolar planets. Analysing RV data to determine the parameters of the extrasolar 
planets is a significant statistical challenge owing to the presence of multiple planets and vari- 
ous degeneracies between orbital parameters. Determining the number of planets favoured by 
the observed data is an even more difficult task. Bayesian model selection provides a mathe- 
matically rigorous solution to this problem by calculating marginal posterior probabilities of 
models with different number of planets, but the use of this method in extrasolar planetary 
searches has been hampered by the computational cost of the evaluating Bayesian evidence. 
Nonetheless, Bayesian model selection has the potential to improve the interpretation of ex- 
isting observational data and possibly detect yet undiscovered planets. We present a new and 
efficient Bayesian method for determining the number of extrasolar planets, as well as for 
inferring their orbital parameters, without having to calculate directly the Bayesian evidence 
for models containing a large number of planets. Instead, we work iteratively and at each it- 
eration obtain a conservative lower limit on the odds ratio for the inclusion of an additional 
planet into the model. We apply this method to simulated data-sets containing one and two 
planets and successfully recover the correct number of planets and reliable constraints on the 
orbital parameters. We also apply our method to RV measurements of HD 37124, 47 Ursae 
Majoris and HD 10180. For HD 37124, we confirm that the current data strongly favour a 
three-planet system. We find strong evidence for the presence of a fourth planet in 47 Ursae 
Majoris, but its orbital period is suspiciously close to one year, casting doubt on its validity. 
For HD 10180 we find strong evidence for a six-planet system. 
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^ . 1 INTRODUCTION niques (see e.g. lMackay||2003l) have made it possible for Bayesian 

techniques to be applied to extrasolar planetary s e arches (see e.g. 



Extrasolar planetary research has been revitalised in the last decade 
and so far more than 500 extrasolar planets have been discovered. 
Improvements in the accuracy of RV measurements have made it 
possible to detect planets with larger orbital periods and smaller 



Gregory 20051: lFordl2005l : iFord & Gregorvl 120071 ; iBalan & Lahavl 
2009). Bayesian methods have several advantages over traditional 



methods, for example when the data do not cover a complete or- 
bital phase of the planet. Bayesian inference also provides a rig- 
velocity amplitudes. With the flood of new data, more powerful r c . , , , .. , ■ , ■ j.j 

J r r orous way ot performing model selection which is required to de- 

statistical techniques are being developed and applied to extract ... , „ . c , , , , „, . , , 

cide the number ot planets favoured by the data. The main problem 
as much information as possible. Traditionally, planet parameters . , . , . , , , ■ , ■ ■ , 

in applying such Bayesian model selection techniques is the com- 
and their uncertainties were obtained by searching for periodicity ^ .. . . . . , . , , .. „ . . , , 

J ° i , || ~ A putational cost involved in calculating the Bayesian evidence (see 

in the RV data using the Lomb-Scargle penodogram dLomblll97q ; S [2]l " 

IScargld[T982h to fix the orbital period and then estimating other 

parameters by using minimisation algorithms. IClvde et ail d2007l) recently reviewed the state of tech- 

Recent advances in Marko-Chain Monte Carlo (MCMC) tech- niques for mod e l sele ction from a statistical perspective and 

IFord & Gregorvl d2007l) evaluated the performance of a variety 
of marginal li kelihoo d estimators in the extrasolar planet con- 
* E-mail: f.feroz@mrao.cam.ac.uk text. iGregorvl {2007b) found good agreement (within 28%) be- 



© 2010 RAS 



2 F. Feroz, S.T. Balan and M.P. Hobson 



tween three estimators: (a) parallel tempering, (b) the ratio esti- 
mator, and (c) Restricted Monte Carlo (RMC) for one and two 
planet models. However, for a 3 planet model the three estima- 
tors di verged significantly with the RMC yielding the lowest es- 
timate. Greg ory & Fischej [20 1 G ) introduced the Nested Restricted 
Monte Carlo (NMRC) estimator, an improvement on the RMC es- 
timator. The NRMC estimator is expected to provide a conserva- 
tive lower bound on the Bayesian evidence in higher dimensions. 
These Bayesian model selection techniques have already resulted 
in the di scovery of previously un known planets in existing data- 
sets, e.g. lTuomi & Kotiranta (2009) di scovered a second planet or- 
biting HP 11506 and iGregorv & Fischer! < |2010|) reported a third 
planet orbiting 47 Ursae Majoris using Bayesian analysis. Never- 
theless, most of the Bayesian model selection techniques employed 
so far in extrasolar planetary searches have relied on estimates of 
the Bayesian evidence, with uncertain accuracy. Our aim in this pa- 
per is to present a new and efficient method for Bayesian model se- 
lection to determine the number of planets favoured by the data, and 
estimate their parameters, without having to calculate directly the 
Bayesian evidence for models containing a large number of planets. 

The outline of this paper is as follows. We give a brief in- 
troduction to Bayesian inference in Sec. [2] and describe various 
Bayesian object detection techniques in Sec. [3] Our model for cal- 
culating radial velocities is described in Sec. [4] In Sec. [5] we de- 
scribe our Bayesian analysis methodology including the descrip- 
tions of likelihood and prior probability functions. We apply our 
method to simulated data in in Sec. [6] and to real RV data sets on 
HD 37124, 47 Ursae Majoris and HD 10180 in Sec.[7] Finally our 
conclusions are presented in Sec. [8] 



2 BAYESIAN INFERENCE 

Our planet finding methodology is built upon the principles of 
Bayesian inference, and so we begin by giving a brief summary 
of this framework. Bayesian inference methods provide a consis- 
tent approach to the estimation of a set of parameters © in a model 
(or hypothesis) H for the data D. Bayes' theorem states that 

Pr(e|D , HH MH|^||£(5H), m 

where Pr(0jD,i/) = P(0) is the posterior probability distri- 
bution of the parameters, Pr(D|@, H) = £(©) is the likelihood, 
Pr(0|#) = 7r(0) is the prior, and Pr(D|//) = Z is the Bayesian 
evidence. 

In parameter estimation, the normalising evidence factor is 
usually ignored, since it is independent of the parameters ©, and 
inferences are obtained by taking samples from the (unnormalised) 
posterior using standard MCMC sampling methods, where at equi- 
librium the chain contains a set of samples from the parameter 
space distributed according to the posterior. This posterior consti- 
tutes the complete Bayesian inference of the parameter values, and 
can be marginalised over each parameter to obtain individual pa- 
rameter constraints. 

In contrast to parameter estimation problems, for model se- 
lection the evidence takes the central role and is simply the factor 
required to normalize the posterior over ©: 



Z 



£(0)7r(0)d°0, 



(2) 



A lni?| 


Odds 


Probability 


Remark 


< 1.0 


< 3 : 1 


< 0.750 


Inconclusive 


1.0 


~ 3 : 1 


0.750 


Weak Evidence 


2.5 


~ 12 : 1 


0.923 


Moderate Evidence 


5.0 


~ 150 : 1 


0.993 


Strong Evidence 



Table 1. The scale we use for the interpretation of model probabilities. 



a model if more of its parameter space is likely and smaller for a 
model with large areas in its parameter space having low likelihood 
values, even if the likelihood function is very highly peaked. Thus, 
the evidence automatically implements Occam's razor: a simpler 
theory with compact parameter space will have a larger evidence 
than a more complicated one, unless the latter is significantly bet- 
ter at explaining the data. The question of model selection between 
two models Ho and H\ can then be decided by comparing their 
respective posterior probabilities given the observed data set D, as 
follows 



R = 



Pr(#ijD) _ Pr(Dj#i)Pr(#i) _ Z x Pr(#i 



Pr(#o|D) Pr(DiH )Pr(#o) Z Pt(H ) 



(3) 



where Pt(Hi)/ Pt(Hq) is the a priori probability ratio for the two 
models, which can often be set to unity but occasionally requires 
further consideration. The natural logarithm of the ratio of poste- 
rior model probabilities (sometimes termed the posterior odds ra- 
tio) provides a useful guide to what constitutes a significant differ- 
ence between two models: 



A In R = In 



'Pr(#i 


D)" 


= In 


~Z X Pr(Hx)" 


|Pr(#o 


D). 


_£ Pr(#o)J 



(4) 



where D is the dimensionality of the parameter space. As the av- 
erage of the likelihood over the prior, the evidence is larger for 



We summarize the convention usually used for model selection in 
Tabled 

Evaluation of the multidimensional integral in Eq. [2] is a 
challenging numerical task. Standard techniques like thermody- 
namic integration are extremely computationally expensive which 
makes evidence evaluation at least an order of magnitude more 
costly than parameter estimation. Some fast approximate meth- 
ods have been used for evidence evaluation, such as treating the 
posterior as a multivariate G aussian centred at its peak (see e.g. 
iHobson & McLa chlan 2003), but this approximation is clearly a 
poor one for multimodal posteriors (except perhaps if one performs 
a separate Gaussian approximation at each mode). The Savage- 
Dickey density ratio has also been proposed (see e.g. lTrottall2007h 
as an exact, and potentially faster, means of evaluating evidences, 
but is restricted to the special case of nested hypotheses and a 
separable prior on the model parameters. Various alternative infor- 
mation crite ria for astrophysical model selection are discussed by 
Liddle ( 2007), but the evidence remains the preferr ed method. 

The nested sampling approach, introduced bv lSkillingl J2004T) . 
is a Monte Carlo method targeted at the efficient calculation of 
the evid ence, but also produce s p osterior infe r ences as a by- 
product. iFeroz & Hobson! ( 12008!) and lFeroz etall d2009bh built on 
this nested sampling framework and have recently introduced the 
MultiNest algorithm which is very efficient in sampling from 
posteriors that may contain multiple modes and/or large (curving) 
degeneracies and also calculates the evidence. This technique has 
greatly reduces the computational cost of Bayesian parameter es- 
timation and model selection and has already been ap plied to sev- 
eral model se lections problem in astrophysics (see e.g. lFeroz et al.l 
2008 j2009dlah . We employ this technique in this paper. 
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3 BAYESIAN OBJECT DETECTION 

To detect and characterise an unknown number of objects in a 
dataset the Bayesian purist would attempt to infer simultaneously 
the full set of parameters O = {-ZVobj, ©i, ©2, • •• , ©Jv ob j : ©n}, 
where A bj is the (unknown) number of objects, Gi are the pa- 
rameters values associated with the ith object, and © n is the set of 
(nuisance) parameters common to all the objects. In particular, this 
approach allows for the inclusion of an informative prior (if avail- 
able) on iV bj . The crucial complication inherent in this approach, 
however, is that the dimensionality of parameter space is variable 
and therefore the analysis method should be able to move between 
spaces of different dim e nsion ality. Such techniques are discussed in 
IHobson & McLachlanl J2003I) . Nevertheless, due to this additional 
complexity of variable dimensionality, the techniques are generally 
extremely computationally intensive. 

An alternative and algorithmically simpler approach for 
achieving virtually the same result 'by hand' is instead to con- 
sider a series of models -Hjv obj , each with a fixed number of ob- 
jects, i.e. with iVobj = 0, 1, 2, . . .. One then infers N t, s by iden- 
tifying the model with the largest marginal posterior probability 
Pr(HM obi |D). The probability associated with iVobj = is often 
called the 'null evidence' and provides a baseline for comparison 
of different models. Indeed, this appr oach has been adopted pr e- 
viously in exoplanet studies (see e.g. iGregorv & Fische 
albeit using only lower-bound estimates of the Bayesian evidence 
for each model. Assuming that there are n p parameters per object 
and n n (nuisance) parameters common to all the objects, for A^bj 
objects, there would be A bjn p + n n parameters to be inferred, 
Thus, the dimensionality of the problem and consequently the vol- 
ume of the parameter space increases almost linearly with A^j. 
Along with this increase in dimensionality, the complexity of the 
problem also increases due to exponential increase in the number 
of modes as a result of counting degeneracy, e.g. for A'obj = 2 and 
© = {©i, ©2, ©n} where 6 1 and ©2 are the parameters values as- 
sociated with first and second objects respectively and © n is the set 
of nuisance parameters, one would get the same value for the like- 
lihood £(©) by just rearranging as {02 , ©1 , n } and therefore 
there should at least be twice as many modes for A'obj = 2 than 
for A bj = 1. Similarly there are n! more modes for A Q bj = n 
than for A bj = 1. This increase in dimensionality and severe 
complexity of the posterior makes it very difficult to evaluate the 
Bayesian evidence, even approximately. In exoplanet analyses, we 
have found that MultiNest is typically capable of evaluating the 
evidence accurately for systems with up to 3 planets. If 4 or more 
planets are present, MultiNest still maps out the posterior distri- 
bution sufficiently well to obtain reliable parameter estimates, but 
can begin to produce inaccurate evidence estimates. Thus, even this 
approach to Bayesian object detection is of limited applicability in 
exoplanet studies. 

If the contributions to the data from each object are reason- 
ably well separated and the correlations between parameters across 
objects is minimal, one can use the alternative approach of setting 
A Q bi = 1 (see. e.g. IHobson & McLachlaJl2003f ; lFeroz & HobsorJ 
2008) and therefore the model for the data consists of only a sin- 
gle object. This does not, however, restrict us to detecting only 
one object in the data. By modelling the data in such a way, we 
would expect the posterior distribution to possess numerous peaks, 
each corresponding to the location of one of the objects. Conse- 
quently the high dimensionality of the problem is traded with high 
multi-modality in this approach, which, depending on the statistical 
method employed for exploring the parameter space, could poten- 



tially simplify the problem enormously. For an application of this 
app roach in detecting galaxy cluster from weak lensing data-sets 
see IFeroz et al.1 J2008h - Unfortunately, for extrasolar planet detec- 
tion using RV, this approach cannot be utilized as the nature of data 
itself makes the parameters of different planets in multi-planet sys- 
tem correlated. 

We therefore propose here a new general approach to Bayesian 
object detection that is applicable to exoplanet studies, even for sys- 
tems with a large number of planets. Motivated by the fact that, as 
discussed above and in Sec. [2] evaluation of the evidence integral 
is a far more computationally demanding procedure than parameter 
estimation, we consider a method based on the analysis of residuals 
remaining after detection and subsequent inclusion in the model of 
A'obj objects from the data, as outlined below. In what follows, we 
will simply assume that the prior ratio in Eq.|4]is unity, so that the 
posterior odds ratio R coincides with the evidence ratio. In princi- 
ple, however, one could adopt a more informative prior ratio given 
a theory of planet formation that predicted the probability distribu- 
tion for the number of planets. 

Our approach to Bayesian object detection is as follows. Let us 
first denote the observed (fixed) data by D = {di, £fe, • • ■ , divi}, 
with the associated uncertainties being {en., cr2, • • ■ , o"m}- In the 
general case that A Q bj = n, let us define the random variable D„ 
as the data that would be collected if the model H n were correct, 
and also the random variable R n = D D n , which are the data 
residuals in this case. If we set A Q bj = n and analyse D to obtain 
samples from the posterior distribution of the model parameters 0, 
using MultiNest, then from these samples it is straightforward to 
obtain samples from the posterior distribution of the data residuals 
R n . This is given by 

Pr(R„|D, H n ) = f Pr(R„|0, H n ) Pr(0|D, H n ) d@, (5) 

where 

Pr(R IO H ) - FT 1 r::n f [A " R ' = A>.i(e)] 2 ) 
Pr(R„ |0, H n ) 11 -j= exp j - f j , 

(6) 

and D p (0) is the (noiseless) predicted data-set corresponding to 
the parameter values 0. It should be noted that $5$ and @ contain 
no approximations. In principle, one could then perform a kernel 
estimation procedure on the samples obtained to produce a (pos- 
sibly analytic) functional form for Pr(R n |D, H). For simplicity, 
we assume here that the residuals are independently Gaussian dis- 
tributed with a mean (R n ) = {ri, r2, • • • , ?*m} an d standard devi- 
ations {a[ , a' 2 , ■ ■ ■ , a' M } obtained from the samples; we find that 
this is a good approximation. 

These residual data (R 7 i), with associated uncertainities, can 
then be analysed with A bj = 0, giving the 'residual null evidence' 
Z T .o, which is compared with the evidence value Z It i obtained by 
analysing (R„) with A Q bj = 1. We denote the natural logarithm of 
the evidence ratio Z It i/Z T .o between these two models by A In Z T . 
We are thus comparing the model Ho that the residual data does not 
contain an additional planet to the model Hi in which an additional 
planet is favoured. 

Our overall procedure is therefore as follows. We first set 
A bj = 1 and analyse the original data set D. If, in the analysis 
of the corresponding residuals data, Hi is favoured over Ho, then 
the original data D are analysed with A bj = 2 and the same pro- 
cess is repeated. In this way, A D bj is increased in the analysis of the 
original data D, until Ho is favoured over Hi in the analysis of the 
corresponding residual data. The resulting value for A bj gives the 
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number of objects favoured by the data. This approach thus only re- 
quires the Bayesian evidence to be calculated for iV bj = 1 model 
(and the JV bj = model, which is trivial); this reduces the com- 
putational cost of the problem significantly. Moreover, in principle, 
this procedure is exact. The only approximation made here, for the 
sake of simplicity, is to assume that Pr(R„|D, H n ) takes the form 
of an uncorrected multivariate Gaussian distribution. 

In adopting this approach, our rationale is that, if the n-planet 
model is correct, the corresponding data residuals R„ should be 
consistent with instrumental noise, perhaps including an additional 
stellar jitter contribution (see Section l5Tt . In this case, the null hy- 
pothesis, Ho, should be preferred over the alternative hypothesis, 
Hi, since the latter supposes that some additional signal, not con- 
sistent with noise, is present in the data residuals. If Hi is preferred, 
we take this as an indication of further planet signal(s) present in 
the data, and therefore re-analysis the original dataset D using an 
(n + 1) -planet model. In this way, we circumvent the problem that 
the inclusion of an additional planet to an n-planet model will in- 
evitably affect the best-fit parameters for the original n-planet sub- 
set. 



4 MODELLING RADIAL VELOCITIES 

It is extremely difficult to observe planets at interstellar distances 
directly, since the planets only reflect the light incident on them 
from their host star and are consequently many times fainter. 
Nonetheless, the gravitational force between the planets and their 
host star results in the planets and star revolving around their com- 
mon centre of mass. This produces doppler shifts in the spectrum 
of the host star according to its RV, the velocity along the line-of- 
sight to the observer. Several such measurements, usually over an 
extended period of time, can then be used to detect extrasolar plan- 
ets. 

Following the formalism given in lBalan & Lahavl d2009l) . for 
N p planets and ignoring the planet-planet interactions, the RV at 
an instant i; observed at j'th observatory can be calculated as: 

iV p 

v(t u j) = V) - K p [sin(/i, p + zd p ) + e p sin(ro p )] , (7) 
p=i 



where 

Vi -- 



Xp 



systematic velocity with reference to jth observatory, 

velocity semi-amplitude of the pth planet, 

longitude of periastron of the pth planet, 

true anomaly of the pth planet, 

orbital eccentricity of the pth planet, 

orbital period of the pth planet, 

fraction of an orbit of the pth planet, prior to the 
start of data taking, at which periastron occurred. 



Note that /i jP is itself a function of e p , P p and Xv While there 
is unique mean line-of-sight velocity of the center of motion, it is 
important to have a different velocity reference Vj for each obser- 
vatory/spectrograph pair, since the velocities are measured differ- 
entially relative to a reference frame specific to each observatory. 

We also model the intrinsic stellar variability s ('jitter'), as a 
source of uncorrelated Gaussian noise in addition to the measure- 
ment uncertainties. Therefore for each planet we have five free pa- 
rameters: K, zu, e, P and x- m addition to these parameters there 
are two nuisance parameters V and s, common to all the planets. 



These orbital parameters can then be used along with the stel- 
lar mass m s to calculate the length a of the semi-major axis of the 
planet's orbit around the centre of mass and the planetary mass m 
as follows: 



a s sm % = 



KPVl - e 2 
2tt 5 



(2ivG)i 



m s a s sm i 



(8) 



(9) 



(10) 



where a s is the semi-major axis of the stellar orbit about the centre- 
of-mass and i is the angle between the direction normal to the 
planet's orbital plane and the observer's line of sight. Since i can- 
not be measured with RV data, only a lower bound on the planetary 
mass m can be estimated. 



5 BAYESIAN ANALYSIS OF RADIAL VELOCITY 
MEASUREMENTS 

There are several RV search programmes looking for extrasolar 
planets. The RV measurements consist of the time t; of the ith 
observation, the measured RV Vi relative to a reference frame and 
the corresponding measurement uncertainty en . These RV measure- 
ments can be analysed using Bayes' theorem given in Eq.[T]to ob- 
tain the posterior probability distributions of the model parameters 
discussed in the previous section. We now describe the form of the 
likelihood and prior probability distributions. 



5.1 Likelihood function 

As discussed in iGregorvl J20075) . the errors on RV measurements 
can be treated as Gaussian and therefore the likelihood function can 
be written as: 



£(e)=n 



+ S 2 ) 



exp 



Me;ii)-0 2 



2(cr 2 + S 2 ) 



(ID 



where Vi and o\ are the i* 1 RV measurement and its corresponding 
uncertainty respectively, v(Q; ti) is the predicted RV for the set of 
parameters O, and s is intrinsic stellar variability. A large value 
of s can also indicate the presence of additional planets, e.g. if a 
two-planet system is analysed with a single-planet model then the 
velocity variations introduced by the second planet would act like 
an additional noise term and therefore contribute to s. 



5.2 Choice of priors 

For parameter estimation, priors become largely irrelevant once the 
data are sufficiently constraining, but for model selection the prior 
dependence always remains. Therefore, it is important that priors 
are selected base d on physical con siderations. We follow the choice 
of priors given in Gregory ( 2007a), as shown in Table|2] 
The modified Jeffreys prior, 



Pt(9\H) 



(<9 + o )m(l + (9 max /6>o)' 



(12) 



behaves like a uniform prior for 6 <C 9o and like a Jeffreys prior 
(uniform in log) for 6 3> 9o- We set Ko = so = 1 m/s and 
ifmax = 2129 m/s, which corresponds to a maximum planet-star 
mass ratio of 0.01. 
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Parameter 


Prior 


Mathematical Form 


Lower Bound 


Upper Bound 


P (days) 


Jeffreys 


l 

Pln(P max /P T i„) 


0.2 


365, 000 


K {mis) 


Mod. Jeffreys 





^max(P min /Pi) 1/3 (l/yi - e?) 


ln(l+(K max /if )(P min /P i ) 1/3 (l/\/l-ef )) 


V (m/s) 


Uniform 


1 


^max 




e 


Uniform 


1 





1 


ro (rad) 


Uniform 


1 

2tt 





2tt 


X 


Uniform 


1 





1 


s (m/s) 


Mod. Jeffreys 


(f+so) _1 
ln(l + Smax / S0 ) 





^max 



Table 2. Prior probability distributions. 



iV p A In Z Aln.Z r s (m/s) 

1 82.29 ±0.15 -1.33 ±0.13 0.42 ± 0.35 



Table 3. The evidence and jitter values for the 1 -planet simulation. 

6 APPLICATION TO SIMULATED DATA 

In this section, we apply our method to two sets of simulations, one 
with only one planet in the data and the other with two planets. Our 
aim here is to test our new methodology for Bayesian object detec- 
tion, in particular the use of the Bayesian evidence in determining 
the correct number of planets. In particula r, we analyse the same 
simulations used in iBalan & Lariavl 12009), which were obtained 
by calculating the radial velocities using for the 1 -planet and 
2-planet models respectively. Gaussian noise with /i = 0.0 m/s and 
a = 2.0 m/s was then added to the resultant radial velocities. 



Parameter 


True 


Estimate 


P (days) 


700.00 


705.09 ± 12.71 


if (m/s) 


60.00 


60.39 ±0.56 


e 


0.38 


0.38 ± 0.01 


zu (rad) 


3.10 


3.10 ± 0.03 


X 


0.67 


0.67 ±0.05 


V (m/s) 


12.00 


11.90 ±0.45 


s (m/s) 


0.00 


0.42 ± 0.35 



Table 4. True and estimated parameter values for the 1 -planet simulation. 
The estimated values are quoted as fj, ± a where fj, and a are the posterior 
mean and standard deviation respectively. 



N p A In Z Aln2 r s (m/s) 

1 41.92 ±0.14 14.82 ±0.14 7.47 ±1.13 

2 67.31 ±0.16 -1.45 ±0.13 0.51 ± 0.41 



6.1 One-planet simulation 

The evidence and jitter values obtained in the analysis of the 1- 
planet simulation are presented in Table [3] Here A In Z denotes 
the natural logarithm of evidence ratio Zn p /Zo, where Zo is the 
evidence for N p = 0. A In Z T is the natural logarithm of evidence 
ratio Zn t x /Zn c where Zn t ! and Zn t are the evidence val- 
ues for analysing the residual data, after subtracting 7V P planets, 
as discussed in Sec. [3] with 1 and planets respectively. A In Z 
therefore, gives the evidence in favour of N p planets while A In Z T 
gives the evidence in favour of there being an additional planet af- 
ter N p planets have already been found and removed from the data. 
The evidence values listed in Table [3] should be compared with the 
scale given in Table Q] It is clear that there is overwhelming evi- 
dence for the presence of 1 planet in the data. The negative A In Z r 
value further indicates that there is no evidence for the presence of 
any additional planets. Furthermore, the logarithm of the evidence 
for the 2-planet model was calculated to be 81.73 ± 0.16, which 
is lower than the logarithm of the evidence for the 1 -planet model 
listed in Table[3] providing further support for the 1-planet model. 

Adopting the 1-planet model, therefore, the resulting esti- 
mated parameter values are listed in Tableland are in excellent 
agreement with the true values used to generate the simulation. 



Table 5. The evidence and jitter values for the 2-planet simulation. 

TVp = 1, the evidence value is quite large but A In Z T gives a very 
clear indication of the presence of an additional planet. The jitter 
s for N p = 1 is also quite large. The presence of a second planet 
is confirmed by A In Z value for N p = 2, which is almost 10 In 
units higher than for N p = 1. The logarithm of the evidence for the 
3-planet model was calculated to be 66.29 ± 0.16, which is lower 
than the 2-planet model (see Table. [TJ, thus indicating a preference 
for the latter. Furthermore, both A In Z T and s for N p = 2 strongly 
suggest that no additional planet is present. Thus, adopting the 2- 
planet model, the estimated parameter values obtained are listed in 
Table [6] Once again they are in excellent agreement with the true 
values used to generate the simulation. 



7 APPLICATION TO REAL DATA 

In this section, we apply our Bayesian object detection technique 
to real RV measurements of HD 37124, 47 Ursae Majoris and HD 
10180 and compare our results with those of previous analyses of 
these systems. 



6.2 Two-planet simulation 

The evidence and jitter values obtained in the analysis of the 2- 
planet simulation are presented in Table [5] One can see that for 



7.1 HD 37124 

HD 37124 is a metal-p oor G4 dwarf star at a distance of 33 pc wit h 
mass0.85±0.02 M jButler et al.l2006l : IValenti & Fischerl2005h . 
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Planet 1 




Planet 2 


Parameter 


True 


Estimate 


True 


Estimate 


P (days) 


700.00 


708.76 ± 15.08 


100.00 


100.45 ± 0.54 


K (m/s) 


60.00 


60.35 ± 0.62 


10.00 


10.20 ± 0.62 


e 


0.38 


0.38 ± 0.02 


0.18 


0.19 ± 0.05 


vj (rad) 


3.10 


3.11 ±0.04 


1.10 


1.27 ±0.40 


X 


0.67 


0.67 ±0.06 


0.17 


0.16 ± 0.08 


V (m/s) 


12.00 


11.80 ±0.52 






s (m/s) 


0.00 


0.51 ±0.41 







Table 6. True and estimated parameter values for the 2-planet simulation. 
The estimated values are quoted as fi ± a where fj, and a are the posterior 
mean and standard deviation respectively. 



N p A In Z T s (m/s) 

1 12.04 ±0.15 13.21 ±1.43 

2 5.17 ±0.15 7.24 ±0.93 

3 -1.62 ±0.14 2.06 ±0.84 



Parameter 


HD 37124 b 


HD 37124 c 


HD 37124 d 


P (days) 


154.48 ±0.14 


853.70 ± 10.02 


2195.48 ±99.06 




(154.39) 


(855.22) 


(2156.73) 


K (m/s) 


27.73 ± 1.06 


14.16 ± 1.26 


14.52 ± 1.96 




(28.38) 


(14.15) 


(14.90) 


e 


0.07 ±0.03 


0.08 ± 0.06 


0.43 ± 0.09 




(0.10) 


(0.04) 


(0.45) 


■no (rad) 


1.41 ± 1.57 


4.07 ± 1.58 


3.47 ±0.35 




(0.70) 


(5.10) 


(3.78) 


X 


0.72 ± 0.13 


0.44 ± 0.35 


0.29 ±0.06 




(0.74) 


(0.04) 


(0.25) 


m sin i (Mj ) 


0.64 ±0.02 


0.58 ± 0.05 


0.73 ±0.07 




(0.66) 


(0.58) 


(0.75) 


a(AU) 


0.53 ± 0.00 


1.66 ±0.01 


3.11 ±0.09 




(0.53) 


(1.66) 


(3.08) 



Table 8. Estimated parameter values for the three planets found orbiting HD 
37124. The estimated values are quoted as fi±a where fj, and a are the pos- 
terior mean and standard deviation respectively. The numbers in parenthesis 
are the maximum-likelihood parameter values. 



Table 7. The evidence and jitter values for the system HD 37 1 24. 



The first planet orbiting HD 37124 was found b ylVost et alJ fcOOfJ) . 
Subs equently two furt her planets were found bv lButler et al. |<|2003j) 
and lVogtetal] j2005) res pective ly. We use the 52 RV measure- 
ments given in lVogt et al. I J2005h for our analysis. The RV data is 
plotted in Fig.Q] 

We follow the object detection methodology outlined in Sec. [3] 
and analyse the RV data, starting with A^ p = 1 and increasing it 
until the residual evidence ratio A In < 0. The resulting evi- 
dence and jitter values are presented in Table|7] We can clearly see 
N p — 3 is the favoured model, with both the residual evidence 
ratio and jitter values strongly implying no additional planets are 
contributing to the data. Adopting the 3-planet model, the estimated 
parameter values are listed in Table [8] while the 1-D marginalised 
posterior probability distributions are shown in Fig. [2] The mean 
RV curve for the 3-planet model is overlaid on the RV measure- 
ments in Fig.Q] 

Comparing our parameter values with those given in 




500 1000 1500 2000 2500 3000 

Julian day number (-2450420.047000) 

Figure 1. Radial velocity measurements, with lcr errorbars, and the mean 
fitted radial velocity curve with three planets for HD 37124. 





A\nZ T 


s (m/s) 


1 


98.27 ±0.25 


10.13 ±0.47 


2 


23.32 ±0.25 


6.19 ±0.36 


3 


4.39 ±0.25 


4.87 ±0.33 


4 


-0.77 ±0.23 


4.35 ±0.33 



Table 9. The evidence and jitter values for the system 47 Ursae Majoris. 



IVogt et alJ ( f2005h , we see that our parameter estimates for planets 
HD 37124 b and HD 37124 c are in very good agreement. However, 
our orbital time period for HD 37124 d is about 100 days lower and 
our estimated eccentrici ty is somewhat h igher. The main reason for 
this discrepancy is that lVogtetal] < l2005l) fixed the eccentricity of 
HD 37124 d at 0.2 which was chosen to ful fill the dynamical sta- 
bility requirement. Gozdziewski et al. (2006) also fitted a 3-planet 
model for HD 37124 and our parameter estimates for all three plan- 
ets are in very good agreement with theirs. 

7.2 47 Ursae Majoris 

47 Ursae Majoris is a solar analog, yello w dwarf star at a di stance 
of 14.06 pc with mass 1.06 ± 0.02 M dTakeda et al]|2007l) . The 
first planet orbiting 47 Ursae Majoris w i th an orbital period of 
1090 days was found bv lButler& Marcvl h996n . A second com- 
panion to 47 Ursae M ajors with orbital period of 2594 ± 90 days 
was discovered by iFischer et al.l d2002r) . Subsequently the com- 
bined RV data for 47 Ursae Majoris from the Lick Observatory, 
spanning 21.6 years, and from the 9.2 m Hobbly-Eberly Tele- 
scope (HET) and 12.7 m Harlam J. Smith (HJS ) telescopes of the 
McDonald Observatory dWittenmver et alj |2009). was analysed by 
Gregory & Fischei] d2010h and strong evidence was found in favour 
of a three-planet system. We analyse the same combined data-set. 
The R V data is plotted in Fig. [3] 

iGregorv & Fischea d201(jf) analysed the RV data this system 
by ignoring the residual velocity offsets associated with dewar 
changes, as well as by incorporating the dewar velocity offsets as 
additional unknown parameters, and found the results to be consis- 
tent. We therefore ignore the velocity offsets associated with dewar 
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Figure 2. 1-D marginalised posterior probability distributions for the parameters of the three planets found orbiting HD 37124. 



changes and fit for three velocity offsets Vl, Vhbt and Vhjs asso- 
ciated with Lick, HET and HJS telescopes respectively. 

We follow the object detection methodology outlined in Sec. [3] 
and analyse the RV data, starting with N p = 1 and increasing it 
until the residual evidence ratio A In Z v < 0. The resulting ev- 
idence and jitter values are presented in Table [9] We can clearly 
see N p — 4 is the favoured model, with the residual evidence ra- 
tio strongly implying no additional planets are contributing to the 
data. Our detection of the fourth planet contradicts the analysis of 
iGregorv & Fischeil J2010h . which did not find a well-defined peak 
for the fourth period using combined Lick, HET and HJS data-sets. 
They did, however, find the fourth planet using only the Lick data- 



set, but their calculated upper limit on the false alarm probability 
for the presence of the fourth planet of « 0.5 was deemed too 
high. Our detected fourth planet has the best-fit orbital period of 
369.7 days, consis t ent w ith the period of fourth planet found by 
IGregorv & Fischerj (2010) in Lick-only data. Nonetheless, this pe- 
riod is suspiciously close to one year, indicating that it might be 
an artefact of the data reduction. We therefore discuss the results 
obtained from the 3-planet model in the rest of this section. 

Adopting the 3-planet model, the estimated parameter values 
are listed in Table [Tol while the 1-D marginalised posterior prob- 
ability distributions are shown in Fig. [4] The mean RV curve for 
the 4-planet model is overlaid on the RV measurements in Fig. [3] 
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4 4.5 5 5.5 6 

s (m/s) 

Figure 4. 1-D marginalised posterior probability distributions for the parameters of the three planets found orbiting 47 Ursae Majoris. 
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Parameter 


47 UMa b 


47 UMa c 


47 UMa d 


P (days) 


1078.26 ± 1.83 


2293.17 ± 79.39 


14674.55 ± 5925.37 




/ 1 (170 co 1 ! 

(10/8.69) 


(ZZZO.bl j 


(1/21 / .04 J 


K (m/s) 


49.49 ± 1.53 


8.49 ± 1.30 


13.52 ± 1.09 




(01. Zz ) 


/in i q\ 
(1U. lo ) 


/ 1 Q A 0\ 

(!o.4Z ) 


e 


0.03 ± 0.01 


0.32 ± 0.18 


0.24 ± 0.16 




/n (\a\ 
(U.U4 ) 


/n c; c; \ 
(U.OO j 


/n Qfi'i 
(U.oO j 


■no (rad) 


4.32 ± 0.74 


2.95 ± 1.32 


2.37 ± 2.37 




(4.29) 


(2.42) 


(0.32) 


X 


0.39 ± 0.11 


0.64 ± 0.28 


0.58 ± 0.19 




(0.41) 


(0.75) 


(0.69) 


m sin i (Mj) 


2.59 ±0.09 


0.53 ± 0.05 


1.58 ±0.17 




(2.71) 


(0.57) 


(1.66) 


o(AU) 


2.10 ±0.02 


3.48 ± 0.08 


11.81 ±2.99 




(2.11) 


(3.43) 


(13.40) 



Table 10. Estimated parameter values for the three planets found orbiting 47 Ursae Majoris. The estimated values are quoted as n ± a where /i and a are the 
posterior mean and standard deviation respectively. The numbers in parentheses are the maximum-likelihood parameter values. 



100 




1000 2000 3000 4000 5000 6000 7000 8000 
Julian day number (-2446959.7372) 

Figure 3. Radial velocity measurements, with la errorbars, and the mean 
fitted radial velocity curve with three planets for 47 Ursae Majoris. 



There is fairly good a greement between our para meter constraints 
and those presented by Greg ory & Fischeil fcOlfjt) . 



7.3 HD 10180 

HD 10180 is a Gl V type star at a d istance of 39 pc with mass 
1.06 ± 0.05 Mq jLoviset all I201CI). Using the R V data from 
HARPS instrument ( lMavorll2003l) . lLovis et"al] d2010t) recently re- 
ported at least five and as many as seven planets orbiting this star. 
There has been much interest in the possible sev enth planet as its 
minimum mass as reported bv lLovisetai] 12010) is 1.4 A/0. We 
analyse the same HARPS data-set after subtracting a mean radial 
velocity of 3.55302 km/s from it. The resultant RV data is plotted 
in Fig. [6] 

The evidence and jitter values are presented in Table [TT] We 
can clearly see N p = 6 is the favoured model, with the residual evi- 
dence ratio strongly implying that the residual data consists of noise 
only. Adopting the 6-planet model, the estimated parameter values 
are listed in Table [T2l while the 1-D marginalised posterior proba- 
bility distributions are shown in Fig. [5] The mean RV curve for the 
6-planet model is overlaid on the RV measurements in Fig. [6] It can 





AlnZ r 


s (m/s) 


1 


24.84 ±0.17 


5.64 ± 0.29 


2 


9.46 ±0.18 


4.55 ± 0.23 


3 


63.47 ±0.17 


3.96 ± 0.20 


4 


45.47 ±0.17 


2.45 ± 0.13 


5 


4.49 ±0.17 


1.58 ± 0.09 


6 


-0.73 ±0.17 


1.36 ± 0.07 



Table 11. The evidence and jitter values for the system HD 10180. 



be seen that our orbital parameters ar e in general reaso nably good 
agree ment with the one s presented in lLovis et al. I fcoioh . 

lLovis et al. I d2010h found fairly strong peaks with periods 
1.178 and 6.51 days in the periodogram of the residuals of the 6- 
planet Kaplerian model. They noted that these two peaks are aliases 
of each other with 1 sidereal day period ( j 1/6.51 — 1.0027| ~ 
1/1.178). Arguing that it is unlikely for the system to be dynami- 
cally stable with two planets having P = 5.76 days and P = 6.51 
days, they concluded that if the 7th signal is caused by a planet, it 
is likely to have P = 1.178 days. Meanwhile, they were not able 
to rule out conclusively or confirm the presence of the 7th planet. 
Our analysis of the residual data of the 6-planet model did reveal 
several peaks in the posterior distribution with periods around 6.51 
and 1 days, but as can be seen from the value of residual evidence 
in Tab. [TT] they were not found to be sufficiently significant. We 
therefore rule out the presence of any additional planets contribut- 
ing to the RV data. 



8 CONCLUSIONS 

We have presented a new and efficient method to detect extrasolar 
planets from RV measurements. Our method is not only able to 
fit for a specific number of planets, but can also infer the number 
of planets from the data using Bayesian model selection. We have 
successfully applied our method to simulated data-sets, as well as 
to the real systems HD 37124, 47 Ursae Majoris and HD 10180. 
Our method can potentially identify many undiscovered extrasolar 
planets in existing RV data-sets. One drawback of our method is 
that it ignores the planet-planet interactions, but these interactions 
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Figure 5. 1-D marginalised posterior probability distributions for the parameters of the six planets found orbiting HD 10180. 
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Parameter 


HD 10180 b 


HD 10180 c 


HD 10180 d 


HD 10180 e 


HD 101801' 


HD 10180 g 


P (days) 


5.76 ± 0.02 


16.35 ± 0.05 


49.74 ±0.20 


122.75 ± 0.54 


600.17 ±13.75 


2266.22 ±412.42 




(5.76) 


(16.36) 


(49.74) 


(122.69) 


(601.88) 


(2231.44) 


K(m/s) 


4.54 ±0.12 


2.89 ±0.13 


4.28 ± 0.14 


2.91 ±0.14 


1.43 ±0.20 


3.06 ±0.16 




(4.63) 


(2.94) 


(4.25) 


(2.70) 


(1.79) 


(2.98) 


e 


0.07 ±0.03 


0.13 ± 0.04 


0.03 ± 0.02 


0.09 ± 0.04 


0.15 ±0.09 


0.09 ± 0.05 




(0.08) 


(0.12) 


(0.03) 


(0.08) 


(0.25) 


(0.05) 


■no (rad) 


2.60 ± 0.38 


2.62 ±0.35 


2.56 ± 0.16 


2.65 ±0.53 


3.08 ± 0.97 


2.89 ± 2.60 




(2.51) 


(2.49) 


(5.12) 


(2.95) 


(2.43) 


(5.98) 


X 


0.22 ± 0.06 


0.35 ± 0.06 


0.43 ± 0.27 


0.23 ±0.11 


0.31 ±0.28 


0.67 ±0.10 




(0.24) 


(0.37) 


(0.83) 


(0.16) 


(0.27) 


(0.73) 


msini (Mj) 


0.04 ± 0.00 


0.04 ±0.00 


0.08 ± 0.00 


0.07 ± 0.00 


0.06 ± 0.00 


0.20 ±0.01 




(0.04) 


(0.04) 


(0.08) 


(0.07) 


(0.07) 


(0.20) 


a(AU) 


0.06 ± 0.00 


0.13 ± 0.00 


0.27 ±0.00 


0.49 ± 0.00 


1.42 ± 0.03 


3.45 ±0.16 




(0.06) 


(0.13) 


(0.27) 


(0.49) 


(1.42) 


(3.40) 



Table 12. Estimated parameter values for the six planets found orbiting HD 10180. The estimated values are quoted as fi ± a where fi and a are the posterior 
mean and standard deviation respectively. The numbers in parenthesis are the maximum-likelihood parameter values. 

e.g. by jointly analysing the RV data and light curves for the same 
system. This would enable us to place better constraints on the plan- 
etary parameters and also to learn about the physical structure of 
the planets. Once again our basic analysis technique can be easily 
extended to perform a joint analysis of data sets of different types 
We plan to extend our approach by incorporating light curve data 
in a forthcoming paper. 
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Figure 6. Top panel shows the radial velocity measurements (after subtract- 
ing mean RV of 3.55302 km/s), with lc errorbars. Bottom panel shows a 
blow-up of the mean fitted radial velocity curve with six planets for HD 
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are important only for a very small fraction of planetary systems. 
Moreover, our basic methodology can be extended to include such 
interactions. This will be undertaken in further work. 

Another important avenue of research in extrasolar planet 
searches is to perform a coherent analysis using different data-sets, 
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